Building quantitative prediction models for tissue residue of two explosives compounds in earthworms from microarray gene expression data.

Authors

  • Ping Gong
  • Po-Ru Loh
  • Natalie D Barker
  • George Tucker
  • Nan Wang
  • Chenhua Zhang
  • B Lynn Escalon
  • Bonnie Berger
  • Edward J Perkins
Abstract

Soil contamination near munitions plants and testing grounds is a serious environmental concern that can result in the formation of tissue chemical residue in exposed animals. Quantitative prediction of tissue residue remains a challenging task despite long-term interest and pursuit, as tissue residue formation is the result of many dynamic processes including uptake, transformation, and assimilation. The availability of high-dimensional microarray gene expression data presents a new opportunity for computational predictive modeling of tissue residue from changes in expression profile. Here we analyzed a 240-sample data set with measurements of transcriptome-wide gene expression and tissue residue of two chemicals, 2,4,6-trinitrotoluene (TNT) and 1,3,5-trinitro-1,3,5-triazacyclohexane (RDX), in the earthworm Eisenia fetida. We applied two different computational approaches, LASSO (Least Absolute Shrinkage and Selection Operator) and RF (Random Forest), to identify predictor genes and build predictive models. Each approach was tested alone and in combination with a prior variable selection procedure that involved the Wilcoxon rank-sum test and HOPACH (Hierarchical Ordered Partitioning And Collapsing Hybrid). Model evaluation results suggest that LASSO was the best performer of minimum complexity on the TNT data set, whereas the combined Wilcoxon-HOPACH-RF approach achieved the highest prediction accuracy on the RDX data set. Our models separately identified two small sets of ca. 30 predictor genes for RDX and TNT. We have demonstrated that both LASSO and RF are powerful tools for quantitative prediction of tissue residue. They also leave more unknown than explained, however, allowing room for improvement with other computational methods and extension to mixture contamination scenarios.
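
As a rough illustration of how LASSO can pick a small set of predictor genes out of a high-dimensional expression matrix, the following is a minimal sketch using cyclic coordinate descent. It is not the authors' pipeline: the data are synthetic, the 500-gene matrix, the five "true" predictor genes, the penalty `alpha=0.3`, and the function names are all assumptions made for the example, and the Wilcoxon, HOPACH, and Random Forest steps are not shown. Only the sample count of 240 is taken from the abstract.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    Assumes the columns of X and the vector y have already been centred.
    """
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-gene mean squared value
    residual = y.copy()                 # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of gene j with the residual that excludes gene j's own term
            rho = X[:, j] @ residual / n + col_sq[j] * w[j]
            w_new = soft_threshold(rho, alpha) / col_sq[j]
            if w_new != w[j]:
                residual -= X[:, j] * (w_new - w[j])
                w[j] = w_new
    return w


# Synthetic stand-in for an expression matrix (hypothetical data, not from the study)
rng = np.random.default_rng(0)
n_samples, n_genes = 240, 500
X = rng.normal(size=(n_samples, n_genes))
true_w = np.zeros(n_genes)
true_w[:5] = [3.0, -2.0, 1.5, 2.5, -3.0]   # only five genes drive the simulated residue
y = X @ true_w + rng.normal(scale=0.5, size=n_samples)

# Centre features and response so that no intercept term is needed
X = X - X.mean(axis=0)
y = y - y.mean()

w = lasso_coordinate_descent(X, y, alpha=0.3)
selected = np.flatnonzero(w)   # indices of genes with non-zero coefficients
print("number of selected genes:", selected.size)
print("selected gene indices:", selected)
```

The L1 penalty sets most coefficients exactly to zero, which is why LASSO performs variable selection and model fitting in a single step. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand as it is here.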

Similar resources

Prediction of blood cancer using leukemia gene expression data and sparsity-based gene selection methods

Background: DNA microarray is a useful technology that simultaneously assesses the expression of thousands of genes. It can be utilized for the detection of cancer types and cancer biomarkers. This study aimed to predict blood cancer using leukemia gene expression data and a robust ℓ2,p-norm sparsity-based gene selection method. Materials and Methods: In this descriptive study, the microarray ...

Bioinformatics-Based Prediction of FUT8 as a Therapeutic Target in Estrogen Receptor-Positive Breast Cancer

Abstract Introduction: Estrogen receptor-positive (ER-positive) breast cancer is a subgroup of breast tumors that is more likely to respond to hormone therapy. ER-positive and ER-negative breast cancers tend to show different patterns of metastasis because of the different signaling cascades and genes that are activated by the estrogen response. Genetic factors can contribute to high rates of metastas...

O-3: Drug Repositioning by Merging Gene Expression Data Analysis and Cheminformatics Target Prediction Approaches

The transcriptional responses to drug treatments, combined with a protein target prediction algorithm, were utilised to associate compounds with biological genomic space. This enabled us to predict efficacy of compounds in cMap and LINCS against 181 databases of diseases extracted from GEO. 18/30 of top drugs predicted for leukemia (e.g. Leflunomide and Etoposide) and breast cancer (e.g. Tamoxifen a...

Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine

DNA microarray gene expression data provide a wealth of information across thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce high-dimensional data with a statistical method to select high-impact genes as biomarkers and then classify ovarian tumors based on gene expression data of...

Journal:
  • Environmental science & technology

Volume 46, Issue 1

Pages: -

Publication date: 2012